Inference in latent factor regression with clusterable features

Abstract

Regression models in which the observed features X ∈ R^p and the response Y ∈ R depend, jointly, on a lower-dimensional, unobserved, latent vector Z ∈ R^K, with K ≪ p, are popular in a large array of applications and are mainly used for predicting a response from correlated features. In contrast, methodology and theory for inference on the regression coefficient β ∈ R^K relating Y to Z are scarce, since typically the unobservable factor Z is hard to interpret. Furthermore, the determination of the asymptotic variance of an estimator of β is a long-standing problem, with solutions known only in a few particular cases. To address some of these outstanding questions, we develop inferential tools for β in a class of factor regression models in which the observed features are signed mixtures of the latent factors. The model specifications are both practically desirable, in that they render interpretability to the components of Z, and sufficient for parameter identifiability. Without assuming that the number of latent factors K or the structure of the mixture is known in advance, we construct computationally efficient estimators of β, along with estimators of other important model parameters. We benchmark the rate of convergence of β by first establishing its ℓ2-norm minimax lower bound, and show that our proposed estimator β̂ is minimax-rate adaptive. Our main contribution is the provision of a unified analysis of the component-wise Gaussian asymptotic distribution of β̂ and, especially, the derivation of a closed-form expression of its asymptotic variance, together with consistent variance estimators. The resulting inferential tools can be used when both K and p are independent of the sample size n, and also when both, or either, of p and K vary with n, while allowing for p > n. This complements the only asymptotic normality results obtained for a particular case of the model under consideration, in the regime K = O(1) and p → ∞, but without a variance estimate. As an application, we provide, within our model specifications, a statistical platform for inference in regression on latent cluster centers, thereby increasing the scope of our theoretical results. We benchmark the newly developed methodology on a recently collected data set for the study of the effectiveness of a new SIV vaccine. Our analysis enables the determination of the top latent antibody-centric mechanisms associated with the vaccine response.
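The model class described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' estimator. It assumes a linear form X = AZ + E and Y = Zᵀβ + ε, with the simplest kind of signed loading matrix A (each feature is a signed copy of exactly one factor), Cov(Z) = I, and independent Gaussian noise. The function names `simulate` and `oracle_beta`, the noise level, and the dimensions are all hypothetical choices made for illustration. Under these assumptions Cov(X, Y) = Aβ, so when A is treated as known (an oracle setting, unlike the paper, where K and the mixture structure are unknown), β can be recovered by least squares.

```python
import numpy as np

def simulate(n, p, K, beta, noise_sd=0.5, seed=0):
    """Draw (X, Y) from an assumed model X = A Z + E, Y = Z^T beta + eps."""
    rng = np.random.default_rng(seed)
    # Assumed loading matrix A (p x K): feature j loads on one factor with sign +/-1.
    labels = np.arange(p) % K              # every factor gets at least one feature
    signs = rng.choice([-1.0, 1.0], size=p)
    A = np.zeros((p, K))
    A[np.arange(p), labels] = signs
    Z = rng.standard_normal((n, K))        # latent factors with Cov(Z) = I (assumption)
    X = Z @ A.T + noise_sd * rng.standard_normal((n, p))
    Y = Z @ beta + noise_sd * rng.standard_normal(n)
    return X, Y, A

def oracle_beta(X, Y, A):
    """Oracle illustration: solve A b = Cov(X, Y) by least squares, with A known."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean()
    cov_xy = Xc.T @ Yc / (len(Y) - 1)      # sample covariance between each feature and Y
    b, *_ = np.linalg.lstsq(A, cov_xy, rcond=None)
    return b

beta = np.array([1.0, -2.0, 0.5])
X, Y, A = simulate(n=5000, p=30, K=3, beta=beta)
beta_hat = oracle_beta(X, Y, A)
print(np.round(beta_hat, 2))
```

The oracle step only shows why a signed loading structure makes β identifiable from second moments. Estimating A and K from data, and attaching a variance estimate to β̂, are the problems the paper itself addresses.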

Related resources

Latent factor regression models for grouped outcomes.

We consider regression models for multiple correlated outcomes, where the outcomes are nested in domains. We show that random effect models for this nested situation fit into a standard factor model framework, which leads us to view the modeling options as a spectrum between parsimonious random effect multiple outcomes models and more general continuous latent factor models. We introduce a set ...

Observed versus latent features for knowledge base and text inference

In this paper we show the surprising effectiveness of a simple observed features model in comparison to latent feature models on two benchmark knowledge base completion datasets, FB15K and WN18. We also compare latent and observed feature models on a more challenging dataset derived from FB15K, and additionally coupled with textual mentions from a web-scale corpus. We show that the observed fea...

Bayesian latent factor regression for functional and longitudinal data.

In studies involving functional data, it is commonly of interest to model the impact of predictors on the distribution of the curves, allowing flexible effects on not only the mean curve but also the distribution about the mean. Characterizing the curve for each subject as a linear combination of a high-dimensional set of potential basis functions, we place a sparse latent factor regression mod...

Modeling adverse birth outcomes via latent factor quantile regression

We describe a Bayesian quantile regression model that uses a factor structure for part of the design matrix. This model is particularly useful when the data comprise numerous indicators of underlying latent factors that analysts wish to include as covariates. We apply the model to a study of birth weights, for which the effects of covariates on the lower quantiles of the response distribution a...

Predictive Inference Using Latent Variables with Covariates.

Plausible values (PVs) are a standard multiple imputation tool for analysis of large education survey data, which measures latent proficiency variables. When latent proficiency is the dependent variable, we reconsider the standard institutionally generated PV methodology and find it applies with greater generality than shown previously. When latent proficiency is an independent variable, we sho...

Journal

Journal title: Bernoulli

Year: 2022

ISSN: 1573-9759, 1350-7265

DOI: https://doi.org/10.3150/21-bej1374